Abstract. Detection and elimination of redundant clauses from propositional
formulas in Conjunctive Normal Form (CNF) is a fundamental problem with numerous
application domains, including AI, and has been the subject of extensive
research. Moreover, a number of recent applications have motivated various extensions
of this problem. For example, unsatisfiable formulas partitioned into disjoint subsets
of clauses (so-called groups) often need to be simplified by removing redundant
groups, or may contain redundant variables, rather than clauses. In this
report we present a generalized theoretical framework of labelled CNF formulas
that unifies various extensions of the redundancy detection and removal problem
and allows us to derive a number of results that subsume and extend previous
work. The follow-up reports contain a number of additional theoretical results
and algorithms for various computational problems in the context of the proposed
framework.



1 Introduction 

Propositional logic formulas in Conjunctive Normal Form (CNF) often have redundant
clauses. In some contexts, redundancy is desirable. For example, the identification of
redundant clauses is a hallmark of modern SAT solvers [30]. In other contexts, redundancy
is undesirable. For example, elimination of redundant clauses is useful in simplifying
knowledge bases [24]. A special case of redundancy deals with unsatisfiable
subformulas, since the identification of Minimal Unsatisfiable Subformulas (MUSes)
finds a wide range of practical applications.

Redundancy in logic has been extensively studied in the recent past [8, 24, 15, 25, 26],
and this work includes complexity characterizations of different computational problems.
Similarly, the specific case of unsatisfiable subformulas has also been extensively
studied [19, 22, 21]. Computational problems of interest include computing a minimal
unsatisfiable subformula, or enumerating them all, and computing an irredundant (or
minimal equivalent) subformula, or enumerating them all. Some of these problems have
been studied in detail for the case where minimality is expressed in terms of clauses.
Moreover, also for the case where minimality is expressed in terms of clauses,
well-known hitting set properties relating minimal unsatisfiable and maximal satisfiable
subformulas have been developed for unsatisfiable formulas [32, 7, 19]. Recently,
this work has been extended to the case of satisfiable formulas [21].
Motivated by practical applications, the extraction of MUSes has recently been generalized
to groups of (related) clauses [27, 31], and to variables [11, 12, 13, 14]. In many
settings [27, 31], it is important to aggregate related clauses (as groups of clauses). In
these cases, MUSes need to be expressed in terms of groups of clauses and not in terms
of individual clauses. Clearly, MUS problems over groups of clauses or over variables
can be extended to the more general case of redundancy removal. For example, one
may want to compute a subformula that has no redundant variables, or a subformula
that has no redundant groups of clauses. Also relevant are enumeration problems for
unsatisfiability and redundancy problems when these problems are expressed in terms
of variables or groups of clauses. For example, one may want to enumerate all the
variable MUSes of a formula, or all the irredundant subformulas when a problem is
represented as groups of (related) clauses.

The main objective of this report is to develop a theoretical framework that provides
a unified approach for tackling redundancy problems in CNF formulas, and includes
unsatisfiable formulas as a special case. This framework enables the generalization of
known theoretical results, and also serves to highlight how existing algorithms for
different computational problems can be adapted and extended [29, 5, 31]. The framework
is based on the concept of a labelled CNF formula, where labels are used to associate
individual clauses of a CNF formula with disjoint groups of clauses, or with variables,
or with literals, or even with arbitrary intersecting groups of clauses. By extending to
the labelled CNF setting the standard definitions of MUSes and MSSes over clauses,
the report shows that well-known properties of hitting set duality [32, 19, 7] also hold
for the general case of unsatisfiable labelled CNF formulas, and so hold for MUS and
MSS problems over variables, literals or arbitrary groups of clauses. More interestingly,
these results also hold for redundancy removal problems for satisfiable formulas, when
defined over clauses, variables, or groups of clauses. The immediate consequences of
these results include the ability to enumerate MSSes and MUSes of labelled CNF
formulas, their extensions to the redundancy removal case, and also the ability to generalize
existing MUS extraction algorithms. A detailed description of the report's contributions
is included in Section 2 and summarized in Table 2.1.

2 Background and Motivation 

We focus on formulas in CNF (formulas, from here on), which we treat as finite multisets
of clauses. We assume that clauses do not contain duplicate variables. Given a
formula $\mathcal{F}$ we denote the set of variables that occur in $\mathcal{F}$ by
$Var(\mathcal{F})$, and the set of variables that occur in a clause $c \in \mathcal{F}$ by
$Var(c)$. An assignment $\tau$ for $\mathcal{F}$ is a map
$\tau : Var(\mathcal{F}) \to \{0, 1\}$. Assignments are extended to clauses and formulas
according to the semantics of classical propositional logic. If $\tau(\mathcal{F}) = 1$,
then $\tau$ is a model of $\mathcal{F}$. If a formula $\mathcal{F}$ has (resp. does not have)
a model, then $\mathcal{F}$ is satisfiable (resp. unsatisfiable).
By $\mathsf{SAT}$ (resp. $\mathsf{UNSAT}$) we denote the set of all satisfiable (resp.
unsatisfiable) CNF formulas. Formula $\mathcal{F}_1$ implies formula $\mathcal{F}_2$
($\mathcal{F}_1 \models \mathcal{F}_2$) if every model of $\mathcal{F}_1$ is a model
of $\mathcal{F}_2$. $\mathcal{F}_1$ is equivalent to $\mathcal{F}_2$
($\mathcal{F}_1 \equiv \mathcal{F}_2$) if they have the same set of models. A clause
$c \in \mathcal{F}$ is redundant in $\mathcal{F}$ if
$\mathcal{F} \setminus \{c\} \equiv \mathcal{F}$, or, equivalently,
$\mathcal{F} \setminus \{c\} \models \{c\}$. Formulas with
(resp. without) redundant clauses are called redundant (resp. irredundant).
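
The definition of a redundant clause can be checked directly on small formulas by enumerating assignments. The following is a minimal brute-force sketch, not an algorithm from this report: the encoding of literals as signed integers, the helper names `models` and `is_redundant`, and the three-clause formula are all assumptions made for illustration.

```python
from itertools import product

def models(clauses, variables):
    """All assignments over `variables` (tuples of booleans) satisfying every clause."""
    found = set()
    for bits in product((False, True), repeat=len(variables)):
        tau = dict(zip(variables, bits))
        # A clause holds if at least one of its literals is true under tau.
        if all(any(tau[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            found.add(bits)
    return found

def is_redundant(formula, index, variables):
    """Clause formula[index] is redundant iff dropping it leaves the models unchanged."""
    rest = formula[:index] + formula[index + 1:]
    return models(rest, variables) == models(formula, variables)

# Hypothetical formula: (x) AND (NOT x OR y) AND (y), with x encoded as 1 and y as 2.
F = [(1,), (-1, 2), (2,)]
VARS = (1, 2)
flags = [is_redundant(F, i, VARS) for i in range(len(F))]
print(flags)
```

In this toy formula the second and third clauses are each redundant on their own, but removing both leaves only (x), which is not equivalent to the original formula. This is why irredundant subformulas are computed by removing redundant clauses one at a time.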



The majority of the research on redundancy in propositional logic addresses unsatisfiable
CNF formulas. Irredundant unsatisfiable formulas are called minimally unsatisfiable
(MU). Explicitly, a formula $\mathcal{F}$ is MU if (i) $\mathcal{F} \in \mathsf{UNSAT}$,
and (ii) for any clause $c \in \mathcal{F}$, $\mathcal{F} \setminus \{c\} \in \mathsf{SAT}$.
A subformula $\mathcal{F}' \subseteq \mathcal{F}$ is a minimally unsatisfiable subformula
(MUS) of $\mathcal{F}$ if $\mathcal{F}'$ is minimally unsatisfiable. The set of all MUSes of
$\mathcal{F}$ is denoted by $\mathsf{MUS}(\mathcal{F})$; in general, a given unsatisfiable
$\mathcal{F}$ may have more than one MUS. MUSes
are of interest for a number of reasons, and have been on the radar of the AI community
for a long time. For example, in the early work of Reiter on model-based diagnosis [32],
MUSes, under the name of minimal conflict sets, are used in the computation of a faulty set
of components of misbehaving systems. More recently, MUSes have found numerous
applications in formal verification of hardware and software systems, product configuration,
etc.; see [28] for concrete examples. Motivated by several applications, minimal
unsatisfiability and related concepts have been extended to CNF formulas where clauses
are partitioned into disjoint sets called groups [27, 31].

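Definition 1 (Group-Oriented MUS). Given an explicitly partitioned unsatisfiable
CNF formula $\mathcal{F} = \mathcal{G}_0 \cup \dots \cup \mathcal{G}_n$, a group-oriented
MUS (or group-MUS) of $\mathcal{F}$ is a set of groups
$\{\mathcal{G}_{i_1}, \dots, \mathcal{G}_{i_k}\}$, $i_j > 0$, such that
$\mathcal{F}' = \mathcal{G}_0 \cup \mathcal{G}_{i_1} \cup \dots \cup \mathcal{G}_{i_k} \in \mathsf{UNSAT}$,
and for every $1 \le j \le k$, $\mathcal{F}' \setminus \mathcal{G}_{i_j} \in \mathsf{SAT}$.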

Note the special role of group $\mathcal{G}_0$ (group-0): this group consists of "background"
clauses that are included in every group-MUS; because of group-0 a group-MUS, as
opposed to an MUS, can be empty. In addition to clauses and groups of clauses, minimal
unsatisfiability has been defined and analysed in terms of the variables of the formula
[11, 14]. Given a CNF formula $\mathcal{F}$, and $V \subseteq Var(\mathcal{F})$, the
subformula of $\mathcal{F}$ induced by $V$ is the formula
$\mathcal{F}|_V = \{c \in \mathcal{F} \mid Var(c) \subseteq V\}$. Then, $\mathcal{F}$ is
variable minimally unsatisfiable (VMU) if $\mathcal{F} \in \mathsf{UNSAT}$, and for any
$V \subset Var(\mathcal{F})$, $\mathcal{F}|_V \in \mathsf{SAT}$, i.e. no
variable can be removed from the formula without making it satisfiable. Here "removal
of a variable" means removal of all clauses that contain this variable. Variable MUSes
(VMUSes) are defined accordingly: $V \subseteq Var(\mathcal{F})$ is a VMUS of $\mathcal{F}$
if $\mathcal{F}|_V$ is VMU. In [3] variable minimal unsatisfiability has been extended in a
number of ways akin to the extension of MUSes with group-MUSes.

A notion dual to minimal unsatisfiability is that of maximal satisfiability: a subformula
$\mathcal{F}' \subseteq \mathcal{F}$ is a maximally satisfiable subformula (MSS) of
$\mathcal{F}$ if $\mathcal{F}' \in \mathsf{SAT}$ and
$\forall c \in \mathcal{F} \setminus \mathcal{F}'$, $\mathcal{F}' \cup \{c\} \in \mathsf{UNSAT}$.
The set of MSSes of a CNF formula $\mathcal{F}$ is denoted by
$\mathsf{MSS}(\mathcal{F})$. MSSes are also of much interest in the context of AI. For one,
given that an MSS constitutes a maximally consistent part of an inconsistent (i.e.
unsatisfiable) formula, MSSes can be used for reasoning in the presence of inconsistency;
see [?] for an example of an MSS-based framework for reasoning with inconsistent knowledge.
Furthermore, an MSS of maximum cardinality constitutes a set of clauses satisfied by a
solution to the Maximum Satisfiability (MaxSAT) problem: given a formula $\mathcal{F}$, find
an assignment that satisfies the maximum number of clauses of $\mathcal{F}$.

Given an MSS $S$ of $\mathcal{F}$, one may also consider the subformula
$\mathcal{F} \setminus S$ of $\mathcal{F}$; such a subformula is called a co-MSS of
$\mathcal{F}$, and the set of all co-MSSes of $\mathcal{F}$ is denoted by
$\mathsf{coMSS}(\mathcal{F})$. Note that when $\mathcal{F} \in \mathsf{UNSAT}$, a co-MSS of
$\mathcal{F}$ is a minimal subformula of $\mathcal{F}$ whose removal from $\mathcal{F}$
restores satisfiability. Thus, for example, in the
context of Reiter's model-based diagnosis framework [32], co-MSSes constitute the
minimal set of components of the faulty system that must be removed to restore its
correct behaviour, i.e. the minimal diagnosis. For a similar reason, in [27] the authors
refer to co-MSSes as minimal correction subsets (MCSes).

Table 2.1. Summary of existing work on redundancy in CNF formulas. The framework of
labelled CNF formulas proposed in this report allows us to "cover" all the empty entries.
(Table body only partially recoverable. Surviving entries: Hitting Set Theorem, UNSAT:
[32, 7, 9, 20, 2]; Hitting Set Theorem, SAT: [21]; MaxSAT (algorithms): [23, 1] and [18].)

The MUSes, MSSes and co-MSSes of a given unsatisfiable formula $\mathcal{F}$ are connected
via the so-called hitting set duality theorem. This theorem has been proved and re-proved
on a number of occasions, starting with [32], and later in [7, 9, 2, 27]. The connection is
expressed in terms of irreducible hitting sets.

Definition 2 ((Irreducible) Hitting Set). Let $\mathcal{S}$ be a collection of arbitrary
sets. A set $H$ is called a hitting set of $\mathcal{S}$ if for all $S \in \mathcal{S}$,
$H \cap S \neq \emptyset$. A hitting set $H$ is irreducible if no $H' \subset H$ is a
hitting set of $\mathcal{S}$.

Then, the hitting set duality theorem states that every MUS of a formula $\mathcal{F}$ is an
irreducible hitting set of the set of co-MSSes of $\mathcal{F}$, and vice versa.

Theorem 1 (cf. [32, 7, 9, 2]). For any unsatisfiable CNF formula $\mathcal{F}$: (i) formula
$\mathcal{M}$ is a co-MSS of $\mathcal{F}$ if and only if $\mathcal{M}$ is an irreducible
hitting set of $\mathsf{MUS}(\mathcal{F})$; (ii) formula $\mathcal{U}$ is an MUS of
$\mathcal{F}$ if and only if $\mathcal{U}$ is an irreducible hitting set of
$\mathsf{coMSS}(\mathcal{F})$.

Besides exposing an interesting connection between the various subformulas of CNF
formulas, hitting set duality is used in algorithms for computation of the set of all
MUSes of CNF formulas; see, for example, [2, 27].
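
The duality stated in Theorem 1 can be observed on a toy instance by exhaustive enumeration. The sketch below is only an illustration under stated assumptions: the four-clause formula, the integer encoding of literals, and all helper names are hypothetical, and the exponential enumeration is unrelated to the MUS enumeration algorithms cited above.

```python
from itertools import combinations, product

# Hypothetical unsatisfiable formula: (x), (NOT x), (NOT x OR y), (NOT y); x = 1, y = 2.
F = [(1,), (-1,), (-1, 2), (-2,)]
VARS = (1, 2)
ALL = frozenset(range(len(F)))  # subformulas are represented by sets of clause indices

def satisfiable(clauses):
    for bits in product((False, True), repeat=len(VARS)):
        tau = dict(zip(VARS, bits))
        if all(any(tau[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False

def subsets(universe):
    items = sorted(universe)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def minimal(sets):
    return {s for s in sets if not any(t < s for t in sets)}

def maximal(sets):
    return {s for s in sets if not any(s < t for t in sets)}

def irreducible_hitting_sets(collection):
    # A hitting set intersects every member; irreducible means subset-minimal.
    return minimal([h for h in subsets(ALL) if all(h & s for s in collection)])

unsat = [s for s in subsets(ALL) if not satisfiable([F[i] for i in s])]
sat = [s for s in subsets(ALL) if satisfiable([F[i] for i in s])]
muses = minimal(unsat)                       # minimal unsatisfiable subformulas
co_msses = {ALL - s for s in maximal(sat)}   # complements of maximal satisfiable ones

print(sorted(map(sorted, muses)), sorted(map(sorted, co_msses)))
```

For this instance the two MUSes are {0, 1} and {0, 2, 3} (clause indices), and the co-MSSes are exactly their irreducible hitting sets, as the theorem predicts.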

The case of redundancy in satisfiable CNF formulas has also been analysed extensively,
for example in [24, 21]. Here the first object of interest is a subformula of a
CNF formula $\mathcal{F}$ that is irredundant and equivalent to $\mathcal{F}$; such
subformulas are called minimal equivalent subformulas (MESes): a subformula
$\mathcal{F}' \subseteq \mathcal{F}$ is an MES of $\mathcal{F}$ if
$\mathcal{F}' \equiv \mathcal{F}$, and $\forall c \in \mathcal{F}'$,
$\mathcal{F}' \setminus \{c\} \not\equiv \mathcal{F}$. The set of all MESes of
$\mathcal{F}$ is denoted by $\mathsf{MES}(\mathcal{F})$.
A number of efficient algorithms for computation of MESes have recently been proposed
in [4]. The dual notion is that of a maximal non-equivalent subformula (MNS):
a subformula $\mathcal{F}' \subset \mathcal{F}$ is an MNS of $\mathcal{F}$ if
$\mathcal{F}' \not\equiv \mathcal{F}$ and $\forall c \in \mathcal{F} \setminus \mathcal{F}'$,
$\mathcal{F}' \cup \{c\} \equiv \mathcal{F}$.
The set of MNSes of a CNF formula $\mathcal{F}$ is denoted by $\mathsf{MNS}(\mathcal{F})$.
Finally, a subformula of $\mathcal{F}$ that is a complement of some MNS of $\mathcal{F}$ is
called a co-MNS of $\mathcal{F}$, and the set of all co-MNSes of $\mathcal{F}$ is denoted by
$\mathsf{coMNS}(\mathcal{F})$. Note that, as opposed to the case of
unsatisfiable formulas, to our knowledge no extensions of MESes and related concepts
to groups of clauses or to the variables of CNF formulas have been proposed.

Table 2.1 summarizes existing work on redundancy over clauses, groups of clauses
and variables. A number of concrete problems and properties can be considered, namely
minimal unsatisfiability, irredundant (or minimal equivalent) subformulas, the hitting set
duality theorem and maximum satisfiability. The table shows references for overviews
or key references for each topic. In the next section we describe a framework of so-called
labelled CNF formulas. This framework serves to generalize all of the existing
work described above, and, in particular, allows us to "cover" all of the empty entries in
the table. We demonstrate the usefulness of the framework by deriving a generalized
version of the hitting set duality theorem. As a by-product we extend the recent results
on irredundant formulas for the case of satisfiable formulas [21]. In addition to the
problems shown in Table 2.1, the framework of labelled CNFs allows addressing redundancy
problems over literals, wire-MUSes for Boolean circuits [?], and the interesting
variables MUS problem [3].

3 Generalized Redundancy 
3.1 Labelled CNF Formulas 

The key observation that motivates the development of the labelled CNF framework is
that in all cases described in Section 2, the redundancy in a CNF formula $\mathcal{F}$ can
be analyzed in terms of possibly intersecting (i.e. not necessarily disjoint) subsets of
clauses of $\mathcal{F}$. An additional feature of some of the cases, for example group-MUS,
is the presence of the background, or group-0, clauses. We capture the semantics of the
intersecting and the background subsets of clauses in the following way.

Definition 3 (Labelled CNF Formula). Let $Lbl$ be a non-empty set of clause labels.
A labelled CNF (LCNF) formula $\Phi$ is a tuple $\langle \mathcal{F}, \lambda \rangle$,
where $\mathcal{F}$ is a CNF formula, and $\lambda : \mathcal{F} \to 2^{Lbl}$ is a (total)
labelling function such that for all $c \in \mathcal{F}$, $\lambda(c)$ is finite.

We refer to the formula $\mathcal{F}$ as the CNF part of $\Phi$ and denote it by
$\mathcal{F}_\Phi$. The labelling function $\lambda$ of $\Phi$ is denoted by
$\lambda_\Phi$. The set of labels $\lambda_\Phi(c)$ for $c \in \mathcal{F}_\Phi$ is referred
to as the set of clause labels of $c$ in $\Phi$. For $l \in Lbl$, we refer to the set of
clauses $\mathcal{F}^l_\Phi = \{c \in \mathcal{F}_\Phi \mid l \in \lambda_\Phi(c)\}$ as the
set of clauses labelled with $l$. The role of labels in LCNF
formulas is to group the clauses of the CNF part into subsets. These subsets can be
disjoint, as, for example, in the group-CNF context [27, 31], or intersecting, as in the
context of the variable-MUS problem [11, 14]. By $\mathcal{F}^\emptyset_\Phi$ we denote the
set $\{c \in \mathcal{F}_\Phi \mid \lambda_\Phi(c) = \emptyset\}$ of
unlabelled clauses. These clauses play the role of group-0 clauses in group-CNFs, or of
uninteresting variables in the extensions of the variable-MUS problem [3]. The subscripts
for the CNF part and the labelling function of $\Phi$ may be omitted when $\Phi$ is
understood from the context. With a slight abuse of notation, by $\lambda(\Phi)$ we denote
the set of active labels of $\Phi$, that is the set
$\bigcup_{c \in \mathcal{F}_\Phi} \lambda(c)$. Note that $\lambda(\Phi)$ is finite, and may
be empty. Some natural examples of labelling functions and labelled CNFs will be given
shortly. The (un)satisfiability, models, and all related concepts of propositional logic
are defined for labelled CNFs with respect to their CNF part. For example, $\Phi$ is
unsatisfiable ($\Phi \in \mathsf{UNSAT}$) if $\mathcal{F}_\Phi \in \mathsf{UNSAT}$.

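Definition 4 (Induced subformula). Let $\Phi = \langle \mathcal{F}, \lambda \rangle$ be a
labelled CNF formula, and let $L \subseteq \lambda(\Phi)$. Then, the subformula of $\Phi$
induced by $L$ is the labelled CNF formula
$\Phi|_L = \langle \mathcal{F}|_L, \lambda \rangle$, where
$\mathcal{F}|_L = \{c \in \mathcal{F} \mid \lambda(c) \subseteq L\}$.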



In other words, $\Phi|_L$ has the same labelling function $\lambda$ as $\Phi$; however, the
CNF part of $\Phi|_L$ contains only those labelled clauses of $\mathcal{F}$ all of whose
labels are included in $L$, and all the unlabelled clauses of $\mathcal{F}$, i.e.
$\lambda(\Phi|_L) \subseteq L$. Alternatively, any clause that has
some label outside of $L$ is removed from $\mathcal{F}$. Thus, it will be convenient to
speak of an operation of removal of a label from
$\Phi = \langle \mathcal{F}, \lambda \rangle$. Let $l \in \lambda(\Phi)$ be any (active)
label; then the LCNF formula $\langle \mathcal{F} \setminus \mathcal{F}^l, \lambda \rangle$
will be said to be obtained by the removal of label $l$
from $\Phi$. Note that Definition 4 implies that for any $L \subset \lambda(\Phi)$ (note the
strict inclusion), we have $\mathcal{F}_{\Phi|_L} \subset \mathcal{F}_\Phi$. Also, note that
it is possible that $\lambda(\Phi|_L) \subset L$: for example, if
for some $l \in \lambda(\Phi) \setminus L$ and some $l' \in L$,
$\mathcal{F}^{l'} \subseteq \mathcal{F}^l$, then $l' \notin \lambda(\Phi|_L)$.

Example 1. Let $Lbl = \mathbb{N}$, and let $\Phi = \langle \{c_1, \dots, c_8\}, \lambda \rangle$
with the clauses $c_i$ and the labelling function $\lambda$ defined as follows (the sets of
clause labels are shown as subscripts).

$c_1 = (\neg y)_{\{1\}}$   $c_3 = (z \vee t)_{\{1\}}$   $c_5 = (x \vee y \vee z)$   $c_7 = (\neg y \vee t)_{\{3\}}$

$c_2 = (y \vee \neg t)_{\{1\}}$   $c_4 = (\neg x)_{\{1,2\}}$   $c_6 = (\neg x \vee y)_{\{2,3\}}$   $c_8 = (\neg t)_{\{4\}}$

The set of active labels of $\Phi$ is $\lambda(\Phi) = \{1, 2, 3, 4\}$. $\Phi$ is satisfiable,
with the (only) model $\{\neg x, \neg y, z, \neg t\}$. The subformula of $\Phi$ induced by the
set of labels $L = \{2, 3, 4\}$ is $\Phi|_L = \langle \{c_5, \dots, c_8\}, \lambda \rangle$.
Additional examples of induced subformulas are
$\Phi|_{\{1,4\}} = \langle \{c_1, c_2, c_3, c_5, c_8\}, \lambda \rangle$ and
$\Phi|_\emptyset = \langle \{c_5\}, \lambda \rangle$.

In the context of redundancy removal in CNF formulas, we speak of redundant
clauses, and the basic, atomic, operation on CNF formulas consists of the removal of a
single clause from the formula. For the general case of labelled CNF formulas the
operation of removal of a single clause is not permitted. Instead, the atomic modification
to labelled CNFs is the removal of a single (active) label, that is, of all clauses in the
CNF part of the formula that are labelled with this label. This is an essential point of the
framework proposed in this report. In fact, when we speak of (proper) subformulas of
labelled CNF formulas, we always mean "subformulas obtained by removal of labels",
or to be precise: $\Phi'$ is a subformula of $\Phi$ if $\Phi' = \Phi|_L$ for some
$L \subseteq \lambda(\Phi)$. When the inclusion is strict, i.e. $L \subset \lambda(\Phi)$,
$\Phi'$ is a proper subformula of $\Phi$. We will use set
notation to denote the subformula relation, e.g. $\Phi' \subset \Phi$. Note that all
subformulas of $\Phi$ have the same set of unlabelled clauses. Finally, we point out that
while $\Phi' \subseteq \Phi$ implies $\mathcal{F}_{\Phi'} \subseteq \mathcal{F}_\Phi$, the
fact that $\mathcal{F}' \subseteq \mathcal{F}$ does not necessarily imply
$\langle \mathcal{F}', \lambda \rangle \subseteq \langle \mathcal{F}, \lambda \rangle$,
again because removal of a single clause is, in general, not allowed in LCNFs.

3.2 Redundancy in Labelled CNFs 

It is not difficult to see that, similar to the case of (plain) CNF, removal of labels from
a labelled CNF formula can never reduce the set of models of the formula, that is, when
$\Phi'$ is a subformula of $\Phi$, we always have $\Phi \models \Phi'$. However, as with
CNFs, removal of some labels from $\Phi$ might not affect the set of models of $\Phi$ at
all. Such labels are then redundant, i.e. all clauses that are labelled with such labels
can be removed from the formula while preserving logical equivalence.

Definition 5 (Redundant label; Redundant LCNF). Let
$\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF formula. A label
$l \in \lambda(\Phi)$ is redundant in $\Phi$ if
$\Phi|_{\lambda(\Phi) \setminus \{l\}} \equiv \Phi$. A formula $\Phi$ is
redundant if $\lambda(\Phi)$ contains redundant labels.



Alternatively, a label $l \in \lambda(\Phi)$ is redundant in
$\Phi = \langle \mathcal{F}, \lambda \rangle$ if
$(\mathcal{F} \setminus \mathcal{F}^l) \models \mathcal{F}^l$. An
irredundant LCNF has the property that the removal of any label from it extends the set
of its models. When the formula is unsatisfiable, this means that the removal of any
label makes it satisfiable, i.e. it is minimally unsatisfiable.
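
Both characterizations of a redundant label can be compared by brute force. The sketch below is an illustration under assumptions, not part of the framework: it reuses the eight-clause formula of Example 1 in the reading $c_1 = (\neg y)$, $c_2 = (y \vee \neg t)$, $c_8 = (\neg t)$, encodes x, y, z, t as 1 to 4, and the helper names are hypothetical.

```python
from itertools import product

VARS = (1, 2, 3, 4)  # assumed encoding of x, y, z, t
PHI = [  # (clause, labels); an empty label set marks an unlabelled clause
    ((-2,), {1}), ((2, -4), {1}), ((3, 4), {1}), ((-1,), {1, 2}),
    ((1, 2, 3), set()), ((-1, 2), {2, 3}), ((-2, 4), {3}), ((-4,), {4}),
]

def models(clauses):
    found = set()
    for bits in product((False, True), repeat=len(VARS)):
        tau = dict(zip(VARS, bits))
        if all(any(tau[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            found.add(bits)
    return found

def redundant_by_equivalence(label):
    """Definition 5: removing the label leaves an equivalent formula."""
    rest = [c for c, labels in PHI if label not in labels]
    return models(rest) == models([c for c, _ in PHI])

def redundant_by_implication(label):
    """Alternative form: the remaining clauses imply the clauses carrying the label."""
    rest = [c for c, labels in PHI if label not in labels]
    labelled = [c for c, labels in PHI if label in labels]
    return models(rest) <= models(labelled)

report = {l: redundant_by_equivalence(l) for l in (1, 2, 3, 4)}
print(report)
```

Under the assumed reading, labels 1, 3 and 4 are each redundant while label 2 is not, so the formula is redundant but not every label can be dropped.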

Definition 6 (Minimally Unsatisfiable LCNF). A labelled CNF formula
$\Phi = \langle \mathcal{F}, \lambda \rangle$ is minimally unsatisfiable if
$\Phi \in \mathsf{UNSAT}$, and for any $L \subset \lambda(\Phi)$, $\Phi|_L \in \mathsf{SAT}$.

The following example demonstrates a number of natural definitions of labelling functions
under which redundant labels capture some well-known notions of redundancy
(cf. Section 2).

Example 2. Let $\mathcal{F}$ be any CNF formula.

(i) Take $\lambda$ to be such that each clause of $\mathcal{F}$ is labelled with a single
distinct label. Then a label $l$ is redundant in
$\Phi = \langle \mathcal{F}, \lambda \rangle$ if and only if the (only) clause labelled
with $l$ is redundant, in the plain CNF sense, in $\mathcal{F}$.

(ii) Take $\lambda$ to be such that each clause of $\mathcal{F}$ is either labelled with a
single, but not necessarily distinct, label, or unlabelled. Then a label $l$ is redundant
in $\Phi = \langle \mathcal{F}, \lambda \rangle$ if and only if the set of clauses
$\mathcal{F}^l$ is redundant, and so we capture the semantics of redundant groups in
group-CNF formulas. The unlabelled clauses correspond to group-0.

(iii) Take $Lbl = Var(\mathcal{F})$, and $\lambda(c) = Var(c)$ for each
$c \in \mathcal{F}$. Then, a label $v$ is redundant in
$\Phi = \langle \mathcal{F}, \lambda \rangle$ if and only if the variable $v$ is redundant
in $\mathcal{F}$. Thus, when $\Phi$ is minimally unsatisfiable, $\mathcal{F}$ is variable
minimally unsatisfiable (VMU).

As with the case of CNF, by iteratively removing redundant labels from an LCNF $\Phi$ we
can obtain a subformula $\Phi'$ of $\Phi$ that is equivalent to $\Phi$ and irredundant.
Thus, the subformula $\Phi'$ is a labelled CNF analog of an MES for (plain) CNF formulas
(cf. Section 2). However, in our framework we chose to define labelled MESes in terms of
subsets of labels, rather than subformulas. We argue that this definition is more natural.
Consider, for example, the case of variable-MUSes (VMUSes). Here, a VMUS is a subset-minimal
set of variables of an unsatisfiable CNF formula, rather than the subformula induced by
these variables. If variables are used as labels of clauses in the LCNF framework, as in
Example 2(iii), then it is indeed the subset of labels of the formula that we are interested
in, and not the subformula itself.

Definition 7 (Labelled Minimal Equivalent Subset (LMES)). Let
$\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF formula. A set of labels
$L \subseteq \lambda(\Phi)$ is a labelled minimal equivalent subset
(LMES) of $\Phi$ if $\Phi|_L \equiv \Phi$, and $\forall L' \subset L$,
$\Phi|_{L'} \not\equiv \Phi$. The set of all LMESes of $\Phi$ is
denoted by $\mathsf{LMES}(\Phi)$.

As with (plain) CNF formulas, when $\Phi$ is unsatisfiable, LMESes of $\Phi$ capture the
generalized notion of minimally unsatisfiable subformulas.

Definition 8 (Labelled Minimal Unsatisfiable Subset (LMUS)). Let
$\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF formula. A set of labels
$L \subseteq \lambda(\Phi)$ is a labelled minimal unsatisfiable
subset (LMUS) of $\Phi$ if $\Phi|_L \in \mathsf{UNSAT}$, and $\forall L' \subset L$,
$\Phi|_{L'} \in \mathsf{SAT}$. The set of all
LMUSes of $\Phi$ is denoted by $\mathsf{LMUS}(\Phi)$.



Table 3.1. Summary of the corner cases for CNF and LCNF formulas. Here $\mathcal{F}$ refers
to a CNF formula, $\Phi$ to an LCNF formula. Cells marked (lost) are not recoverable from
the source.

         | Exists for every formula?                                         | Can be empty formula                                                              | Can be the whole formula
MES      | (lost)                                                            | yes, only when $\mathcal{F} = \emptyset$                                          | (lost)
LMES     | (lost)                                                            | ... when $\mathcal{F}^\emptyset \neq \emptyset$ and all labels are redundant      | (lost)
MNS      | no: when $\mathcal{F} = \emptyset$                                | (lost)                                                                            | no
LMNS     | no: when $\lambda(\Phi) = \emptyset$, or all labels are redundant | yes                                                                               | no
coMNS    | same as MNS                                                       | no                                                                                | yes
coLMNS   | same as LMNS                                                      | no                                                                                | yes



To put the above definitions into a concrete context, consider the labelled CNFs discussed
in Example 2: for the case (i) the LMESes correspond to CNF-based MESes
and LMUSes correspond to MUSes; for the case (ii) the LMUSes correspond to group-MUSes;
for the case (iii) the LMUSes correspond to variable-MUSes (VMUSes).

Note that, by definition, when a label $l$ is irredundant in $\Phi$, every LMES of $\Phi$
must include $l$, and, in fact, the set of all irredundant labels of $\Phi$ is precisely
$\bigcap \mathsf{LMES}(\Phi)$. Thus, $\Phi$ is irredundant if and only if
$\mathsf{LMES}(\Phi) = \{\lambda(\Phi)\}$. Also, note that a label might
be redundant in $\Phi$, but irredundant in a subformula of $\Phi$. However, if $l$ is
irredundant in $\Phi$, it is irredundant in every subformula of $\Phi$.

Clearly, every labelled CNF formula $\Phi$ has at least one LMES, and, furthermore,
for any subformula $\Phi'$ of $\Phi$, $\Phi' \equiv \Phi$ if and only if some LMES of
$\Phi$ is a subset of $\lambda(\Phi')$. Note that in the case of CNF formulas, an MES can be
empty only if the formula itself is empty. For the case of labelled CNFs, an empty LMES can
also occur when all labels are redundant, but this can only happen in the presence of
unlabelled clauses. Note that this additional case is not an artifact of the LCNF framework,
but rather an artifact of the idea of group-0 clauses (in group-CNFs), and of uninteresting
variables (in the extensions of variable-MUSes). For example, a group-MUS is empty when
group-0 is unsatisfiable. Table 3.1 contains a summary of this and other corner cases in the
LCNF framework, and contrasts them with the corner cases in (plain) CNF redundancy.

Example 3. Consider the LCNF formula $\Phi$ from Example 1; for convenience we reproduce
it here.

$c_1 = (\neg y)_{\{1\}}$   $c_3 = (z \vee t)_{\{1\}}$   $c_5 = (x \vee y \vee z)$   $c_7 = (\neg y \vee t)_{\{3\}}$

$c_2 = (y \vee \neg t)_{\{1\}}$   $c_4 = (\neg x)_{\{1,2\}}$   $c_6 = (\neg x \vee y)_{\{2,3\}}$   $c_8 = (\neg t)_{\{4\}}$

To aid the understanding of the example note the following: the clauses $c_1, \dots, c_4$
are implied by the clauses $c_5, \dots, c_8$ ($c_1$ is derived from $c_7, c_8$ by
resolution; $c_2$ is subsumed by $c_8$; $c_3$ is derived from $c_5, c_6, c_7$; $c_4$ is
derived from $c_6, c_7, c_8$); also, the clauses $c_6, c_7, c_8$ are implied by the clauses
$c_1, c_2, c_4$ ($c_6$ is subsumed by $c_4$; $c_7$ is subsumed by $c_1$; $c_8$ is derived
from $c_1, c_2$).

Label 1 is redundant in $\Phi$ due to the fact that the clauses
$\mathcal{F}^1 = \{c_1, \dots, c_4\}$ are implied by
$\mathcal{F}|_{\{2,3,4\}} = \{c_5, \dots, c_8\}$. However, labels 2, 3 and 4 are irredundant
in $\Phi|_{\{2,3,4\}}$, hence $L_1 = \{2, 3, 4\}$ is a labelled MES of $\Phi$. The formula
$\Phi$ has another LMES: label 3 is redundant in $\Phi$, as the clauses
$\mathcal{F}^3 = \{c_6, c_7\}$ are implied by
$\mathcal{F}|_{\{1,2,4\}} = \{c_1, \dots, c_5, c_8\}$.
However, $\Phi|_{\{1,2,4\}}$ contains a redundant label 4, as clause $c_8$ is implied by
$c_1, c_2$. Now, $\Phi|_{\{1,2\}} = \langle \{c_1, \dots, c_5\}, \lambda \rangle$ is
irredundant: even though clause $c_5$ is implied by $c_2$ and $c_3$ and so is redundant in
the (plain) CNF sense, we cannot remove it from $\Phi|_{\{1,2\}}$; note that this would also
be the case if $\lambda(c_5) = \{2\}$. We conclude that $L_2 = \{1, 2\}$ is
an LMES of $\Phi$.
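
The two LMESes found by hand in Example 3 can be cross-checked by enumerating all sets of labels. The following sketch is purely illustrative and exponential in the number of labels; it assumes the clause reading $c_1 = (\neg y)$, $c_2 = (y \vee \neg t)$, $c_8 = (\neg t)$, the encoding of x, y, z, t as 1 to 4, and hypothetical helper names.

```python
from itertools import combinations, product

VARS = (1, 2, 3, 4)  # assumed encoding of x, y, z, t
PHI = [  # (clause, labels)
    ((-2,), {1}), ((2, -4), {1}), ((3, 4), {1}), ((-1,), {1, 2}),
    ((1, 2, 3), set()), ((-1, 2), {2, 3}), ((-2, 4), {3}), ((-4,), {4}),
]
LABELS = frozenset({1, 2, 3, 4})

def models(clauses):
    found = set()
    for bits in product((False, True), repeat=len(VARS)):
        tau = dict(zip(VARS, bits))
        if all(any(tau[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            found.add(bits)
    return found

def induced(L):
    """CNF part of the subformula induced by the label set L."""
    return [c for c, labels in PHI if labels <= L]

def equivalent(L):
    return models(induced(L)) == models(induced(LABELS))

def label_subsets():
    items = sorted(LABELS)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

# An LMES is a label set inducing an equivalent subformula, none of whose
# proper subsets does.
equiv_sets = [L for L in label_subsets() if equivalent(L)]
lmeses = {L for L in equiv_sets if not any(K < L for K in equiv_sets)}
irredundant = frozenset.intersection(*lmeses)

print(sorted(map(sorted, lmeses)), sorted(irredundant))
```

Under these assumptions the enumeration yields exactly $\{1, 2\}$ and $\{2, 3, 4\}$, and their intersection $\{2\}$ is the set of irredundant labels, in line with the observation that the irredundant labels are precisely the labels common to all LMESes.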

The notion dual to minimal equivalence (resp. minimal unsatisfiability) is that of
maximal non-equivalence (resp. maximal satisfiability). Here we are interested in sets
of labels that induce a subformula of $\Phi$ that is not equivalent to $\Phi$, but for
which the addition of any active label from $\Phi$ results in an equivalent subformula.

Definition 9 (Labelled Maximal Non-equivalent Subset (LMNS)). Let
$\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF formula. A set of labels
$L \subset \lambda(\Phi)$ is a labelled maximal non-equivalent subset (LMNS) of $\Phi$ if
$\Phi|_L \not\equiv \Phi$ and for every $L'$, $L \subset L' \subseteq \lambda(\Phi)$,
$\Phi|_{L'} \equiv \Phi$. The set of all LMNSes of $\Phi$ is denoted by
$\mathsf{LMNS}(\Phi)$.

Note that just as with clausal MNSes, which do not exist for empty formulas because
every subformula of an empty formula is equivalent to it, LMNSes do not exist for
LCNF formulas with $\lambda(\Phi) = \emptyset$. Also, just as with LMESes, the presence of
unlabelled clauses gives rise to an additional corner case (see also Table 3.1): when all
labels are redundant (for non-empty formulas this can only happen if
$\mathcal{F}^\emptyset \neq \emptyset$), every
subformula of $\Phi$ is also equivalent to $\Phi$. For the case of unsatisfiable LCNFs, we
have a definition analogous to that of (clausal) MSS.

Definition 10 (Labelled Maximal Satisfiable Subset (LMSS)). Let
$\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF formula. A set of labels
$L \subseteq \lambda(\Phi)$ is a labelled maximal satisfiable subset
(LMSS) of $\Phi$ if $\Phi|_L \in \mathsf{SAT}$ and for every $L'$,
$L \subset L' \subseteq \lambda(\Phi)$, $\Phi|_{L'} \in \mathsf{UNSAT}$. The set
of all LMSSes of $\Phi$ is denoted by $\mathsf{LMSS}(\Phi)$.

Note that as opposed to MSSes, which exist for every CNF formula, LMSSes do not 
exist for formulas with an unsatisfiable set of unlabelled clauses, because no subformula 
of such a formula is satisfiable. 

As discussed in Section 2, clausal MSSes are of interest for a number of reasons,
one of which is that an MSS of maximum cardinality is a set of clauses that are true under
a solution to the MaxSAT problem. With this in mind we can also define a generalized
version of the MaxSAT problem.

Given an LMSS $L$ of $\Phi$, one may also consider its complement
$\lambda(\Phi) \setminus L$. When $\Phi \in \mathsf{SAT}$, the complement is the empty set;
however, when $\Phi \in \mathsf{UNSAT}$, $\lambda(\Phi) \setminus L$ is
a minimal set of labels of $\Phi$ whose removal from $\Phi$ restores satisfiability.
The corresponding concept in the context of unsatisfiable CNF is that of co-MSS (cf.
Section 2). A similar, though less intuitive, concept arises in the case of LMNSes.

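Definition 11 (co-LMNS). Let $\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF
formula. A set of labels $L \subseteq \lambda(\Phi)$ is a labelled co-MNS (co-LMNS) of
$\Phi$ if $\lambda(\Phi) \setminus L \in \mathsf{LMNS}(\Phi)$. Or, explicitly,
if $\Phi|_{\lambda(\Phi) \setminus L} \not\equiv \Phi$ and for any $L' \subset L$,
$\Phi|_{\lambda(\Phi) \setminus L'} \equiv \Phi$. The set of all co-LMNSes of $\Phi$ is
denoted by $\mathsf{coLMNS}(\Phi)$.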



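Definition 12 (co-LMSS). Let $\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF
formula. A set of labels $L \subseteq \lambda(\Phi)$ is a labelled co-MSS (co-LMSS) of
$\Phi$ if $\lambda(\Phi) \setminus L \in \mathsf{LMSS}(\Phi)$. Or, explicitly,
if $\Phi|_{\lambda(\Phi) \setminus L} \in \mathsf{SAT}$, and for any $L' \subset L$,
$\Phi|_{\lambda(\Phi) \setminus L'} \in \mathsf{UNSAT}$. The set of all co-LMSSes
of $\Phi$ is denoted by $\mathsf{coLMSS}(\Phi)$.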

Example 4. Consider again the LCNF formula $\Phi$ from Example 1. The formula has
three LMNSes: {1, 3, 4}, {2, 3} and {2, 4}, and three corresponding co-LMNSes.
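
The LMNSes listed in Example 4, and their complements, can be recomputed by applying Definitions 9 and 11 literally. As before this is a brute-force sketch under assumptions: the clause reading $c_1 = (\neg y)$, $c_2 = (y \vee \neg t)$, $c_8 = (\neg t)$, the encoding of x, y, z, t as 1 to 4, and the helper names are not taken from the report.

```python
from itertools import combinations, product

VARS = (1, 2, 3, 4)  # assumed encoding of x, y, z, t
PHI = [  # (clause, labels)
    ((-2,), {1}), ((2, -4), {1}), ((3, 4), {1}), ((-1,), {1, 2}),
    ((1, 2, 3), set()), ((-1, 2), {2, 3}), ((-2, 4), {3}), ((-4,), {4}),
]
LABELS = frozenset({1, 2, 3, 4})

def models(clauses):
    found = set()
    for bits in product((False, True), repeat=len(VARS)):
        tau = dict(zip(VARS, bits))
        if all(any(tau[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            found.add(bits)
    return found

def equivalent(L):
    """True iff the subformula induced by L is equivalent to the whole formula."""
    return models([c for c, labels in PHI if labels <= L]) == models([c for c, _ in PHI])

def label_subsets():
    items = sorted(LABELS)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def is_lmns(L):
    # Non-equivalent, and every strict superset within the active labels is equivalent.
    return (not equivalent(L)) and all(equivalent(K) for K in label_subsets() if L < K)

lmnses = {L for L in label_subsets() if is_lmns(L)}
co_lmnses = {LABELS - L for L in lmnses}

print(sorted(map(sorted, lmnses)), sorted(map(sorted, co_lmnses)))
```

Under the assumed reading the co-LMNSes come out as {2}, {1, 3} and {1, 4}; each is a minimal set of labels whose removal changes the set of models.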

3.3 Generalized Hitting Set Duality 

As mentioned in Section 2, for a given CNF formula $\mathcal{F}$, there is a relationship
between the set of MUSes of $\mathcal{F}$ and the set of co-MSSes of $\mathcal{F}$:
$\mathsf{coMSS}(\mathcal{F})$ is the set of irreducible hitting sets of
$\mathsf{MUS}(\mathcal{F})$. This relationship has been (re)discovered on a number
of occasions, with the earliest, to our knowledge, attributed to Reiter [32] in the context
of model-based diagnosis; there MUSes are called minimal conflict sets, and
co-MSSes are called minimal diagnoses. This relationship is a basis for efficient
MUS enumeration algorithms (cf. [2, 27]). A weaker form of this relationship, namely
$\bigcup \mathsf{MUS}(\mathcal{F}) = \mathcal{F} \setminus \bigcap \mathsf{MSS}(\mathcal{F})$,
derived by Kullmann [20], has also been generalized in
[21] to the case of satisfiable CNF formulas. In this section we develop a general version
of the hitting set theorem for labelled CNF formulas. In addition to subsuming the
previous results, the theorem covers all the other, not previously analyzed, cases, e.g.
group-MUS or variable-MUS. The theorem also allows the development of effective algorithms
for computing the set of all LMESes.

The proof of the theorem relies on a number of basic properties of LMESes and
LMNSes, as well as the following known property of irreducible hitting sets (recall
Definition 2). The property asserts that every element of an irreducible hitting set must,
in a sense, have a "reason" to be there, i.e. to be a unique representative of some set.

Proposition 1. Let $\mathcal{S}$ be a collection of arbitrary sets, and let $H$ be any
hitting set of $\mathcal{S}$. Then, $H$ is irreducible if and only if
$\forall h \in H$, $\exists S \in \mathcal{S}$ such that $H \cap S = \{h\}$.
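
Proposition 1 gives a local test for irreducibility that avoids examining proper subsets. The sketch below compares that test with the subset-based definition on a small, hypothetical collection of sets; the function names and the collection are assumptions made for illustration.

```python
from itertools import combinations

def is_hitting_set(H, collection):
    """H intersects every set of the collection."""
    return all(H & S for S in collection)

def irreducible_by_definition(H, collection):
    """No proper subset of H is still a hitting set (Definition 2)."""
    proper = (frozenset(s) for r in range(len(H)) for s in combinations(sorted(H), r))
    return not any(is_hitting_set(P, collection) for P in proper)

def irreducible_by_witness(H, collection):
    """Every element h of H is the sole representative of some set (Proposition 1)."""
    return all(any(H & S == {h} for S in collection) for h in H)

# Hypothetical collection over the universe {1, 2, 3, 4}.
COLLECTION = [frozenset({1, 2}), frozenset({2, 3, 4})]
UNIVERSE = (1, 2, 3, 4)
hitting_sets = [
    frozenset(s)
    for r in range(len(UNIVERSE) + 1)
    for s in combinations(UNIVERSE, r)
    if is_hitting_set(frozenset(s), COLLECTION)
]
irreducible = [H for H in hitting_sets if irreducible_by_witness(H, COLLECTION)]
print(sorted(map(sorted, irreducible)))
```

The witness-based test needs one pass over the collection per element of $H$, whereas the definition quantifies over all proper subsets; on this collection both single out {2}, {1, 3} and {1, 4}.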

The hitting sets relationship is captured formally by the following theorem. 

Theorem 2 (Generalized Hitting Set Duality Theorem). Let
$\Phi = \langle \mathcal{F}, \lambda \rangle$ be a labelled CNF formula such that
$\lambda(\Phi) \neq \emptyset$, and if $\mathcal{F}^\emptyset \neq \emptyset$ then at least
one label in $\lambda(\Phi)$ is irredundant. Then,

(i) $L \subseteq \lambda(\Phi)$ is a co-LMNS of $\Phi$ if and only if $L$ is an irreducible
hitting set of $\mathsf{LMES}(\Phi)$.

(ii) $L \subseteq \lambda(\Phi)$ is an LMES of $\Phi$ if and only if $L$ is an irreducible
hitting set of $\mathsf{coLMNS}(\Phi)$.

Note that the restrictions on the formula $\Phi$ in the above theorem are in place to
ensure that the formula has at least one co-LMNS (cf. Table 3.1). These restrictions are
satisfied a priori for a number of special cases, which we discuss shortly.
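
Both directions of Theorem 2 can be checked exhaustively on the running example, which satisfies the premises: its set of active labels is non-empty and, although it has an unlabelled clause, label 2 is irredundant. The sketch below is a sanity check under assumptions, not a proof and not an algorithm proposed in this report; it assumes the clause reading $c_1 = (\neg y)$, $c_2 = (y \vee \neg t)$, $c_8 = (\neg t)$, the encoding of x, y, z, t as 1 to 4, and hypothetical helper names.

```python
from itertools import combinations, product

VARS = (1, 2, 3, 4)  # assumed encoding of x, y, z, t
PHI = [  # (clause, labels)
    ((-2,), {1}), ((2, -4), {1}), ((3, 4), {1}), ((-1,), {1, 2}),
    ((1, 2, 3), set()), ((-1, 2), {2, 3}), ((-2, 4), {3}), ((-4,), {4}),
]
LABELS = frozenset({1, 2, 3, 4})

def models(clauses):
    found = set()
    for bits in product((False, True), repeat=len(VARS)):
        tau = dict(zip(VARS, bits))
        if all(any(tau[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            found.add(bits)
    return found

def equivalent(L):
    return models([c for c, labels in PHI if labels <= L]) == models([c for c, _ in PHI])

def label_subsets():
    items = sorted(LABELS)
    return [frozenset(s) for r in range(len(items) + 1) for s in combinations(items, r)]

def minimal(sets):
    return {s for s in sets if not any(t < s for t in sets)}

def maximal(sets):
    return {s for s in sets if not any(s < t for t in sets)}

def irreducible_hitting_sets(collection):
    return minimal([h for h in label_subsets() if all(h & s for s in collection)])

lmeses = minimal([L for L in label_subsets() if equivalent(L)])
lmnses = maximal([L for L in label_subsets() if not equivalent(L)])
co_lmnses = {LABELS - L for L in lmnses}

part_i = irreducible_hitting_sets(lmeses) == co_lmnses    # Theorem 2 (i)
part_ii = irreducible_hitting_sets(co_lmnses) == lmeses   # Theorem 2 (ii)
print(part_i, part_ii)
```

Computing LMNSes as the maximal non-equivalent label sets relies on the monotonicity noted earlier: every superset of an equivalent label set also induces an equivalent subformula.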

The intuition behind (i) can be explained as follows³: since the removal of a
co-LMNS from a formula $\Phi$ makes it non-equivalent to $\Phi$, the removal must "break" each
of the LMESes of the formula. Hence a co-LMNS must include at least one label from
each of the LMESes, i.e. it is a hitting set of the set of LMESes of the formula. The
minimality of a co-LMNS implies the irreducibility of the hitting set, and vice versa.

3 This explanation is a generalized version of the one given for the unsatisfiable CNF case in [27].
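
By part (ii) of the theorem, the LMESes of a formula are exactly the irreducible hitting sets of its co-LMNSes, so once the co-LMNSes are known the LMESes follow from a purely combinatorial computation. The Python sketch below shows one simple way to do this, using a Berge-style incremental enumeration of minimal hitting sets. It is not an algorithm taken from the report: the function name and the input collection are hypothetical, and it assumes the co-LMNSes have already been computed by other means.

```python
def minimal_hitting_sets(collection):
    """Enumerate all irreducible (minimal) hitting sets of a collection of sets.

    Berge-style incremental scheme: process the sets one at a time, extend
    every partial hitting set that misses the new set by each of its
    elements, then discard candidates that are not minimal.
    """
    partial = {frozenset()}
    for s in collection:
        extended = set()
        for h in partial:
            if h & s:
                extended.add(h)  # h already hits s
            else:
                extended.update(h | {x} for x in s)
        # Keep only candidates with no proper subset among the candidates.
        partial = {h for h in extended if not any(g < h for g in extended)}
    return partial


# Hypothetical input: suppose the co-LMNSes of some formula are {1, 2} and {2, 3}.
co_lmnses = [{1, 2}, {2, 3}]
lmeses = minimal_hitting_sets(co_lmnses)
assert lmeses == {frozenset({2}), frozenset({1, 3})}

# Dualizing once more recovers the co-LMNSes, as part (i) predicts.
assert minimal_hitting_sets(lmeses) == {frozenset(s) for s in co_lmnses}
```

The intermediate candidate sets can grow exponentially, so this scheme is suitable for illustration and small inputs only.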

Before we proceed with the proof of Theorem 2, recall a simple property of
subformulas of any LCNF formula $\Phi$ that satisfies the conditions of the theorem: for any
$\Phi' \subseteq \Phi$, $\Phi' \not\equiv \Phi$ if and only if $\Lambda(\Phi')$ is a subset of some LMNS of $\Phi$; $\Phi' \equiv \Phi$ if and
only if $\Lambda(\Phi')$ is a superset of some LMES of $\Phi$.

Proof. For clarity we adopt the following convention: letter S will be used to denote 
LMNSes, M to denote co-LMNSes, U to denote LMESes. 

Part (i), If: Let $M$ be an irreducible hitting set of $\mathrm{LMES}(\Phi)$, and let $S = \Lambda(\Phi) \setminus M$.
First, since $M$ is a hitting set of $\mathrm{LMES}(\Phi)$, $S$ cannot include an LMES of $\Phi$, and so
$\Phi|_S \not\equiv \Phi$. Since $M$ is an irreducible hitting set of $\mathrm{LMES}(\Phi)$, for any label $l \in M$, there
exists $U \in \mathrm{LMES}(\Phi)$ such that $M \cap U = \{l\}$ (by Proposition 1). Hence, for any $l \in M$,
the set $S \cup \{l\}$ includes some LMES $U$ of $\Phi$, and so $\Phi|_{S \cup \{l\}} \equiv \Phi$. We conclude that $S$
is an LMNS of $\Phi$, and so $M$ is a co-LMNS of $\Phi$.

Part (i), Only-if: Let $M$ be any co-LMNS of $\Phi$, and let $S = \Lambda(\Phi) \setminus M$ be the
corresponding LMNS. Since $\Phi|_S \not\equiv \Phi$, for any $U \in \mathrm{LMES}(\Phi)$, $U \setminus S \neq \emptyset$ (otherwise
$U \subseteq S$), and so $U \cap M \neq \emptyset$, that is, $M$ is a hitting set of $\mathrm{LMES}(\Phi)$. Now, since $S$ is
an LMNS, for every label $l \in M$, $\Phi|_{S \cup \{l\}} \equiv \Phi$. Thus, for every $l \in M$, there exists an
LMES $U$ such that $M \cap U = \{l\}$. By Proposition 1, $M$ is an irreducible hitting set of
$\mathrm{LMES}(\Phi)$.

Part (ii), If: Let $U$ be an irreducible hitting set of $\mathrm{coLMNS}(\Phi)$. We have that for
any $M \in \mathrm{coLMNS}(\Phi)$, $U \cap M \neq \emptyset$. Hence, for no $S \in \mathrm{LMNS}(\Phi)$ we have $U \subseteq S$,
and so $\Phi|_U \equiv \Phi$. Since $U$ is irreducible, by Proposition 1, for every label $l \in U$, there
exists $M \in \mathrm{coLMNS}(\Phi)$ such that $U \cap M = \{l\}$. Thus, for every $l \in U$, there exists a
co-LMNS $M$ such that $U' = U \setminus \{l\} \subseteq \Lambda(\Phi) \setminus M$, i.e. $U'$ is included in some LMNS of
$\Phi$, and so $\Phi|_{U'} \not\equiv \Phi$. We conclude that $U \in \mathrm{LMES}(\Phi)$.

Part (ii), Only-if: Let $U$ be any LMES of $\Phi$. Since $\Phi|_U \equiv \Phi$, $U$ cannot be included
in any LMNS of $\Phi$, and so for every co-LMNS $M$ of $\Phi$, we have $U \cap M \neq \emptyset$, i.e. $U$
is a hitting set of $\mathrm{coLMNS}(\Phi)$. Now, since $U$ is an LMES of $\Phi$, for any label $l \in U$,
$\Phi|_{U \setminus \{l\}} \not\equiv \Phi$, and so the set $U \setminus \{l\}$ is included in some LMNS of $\Phi$. Hence, for any
label $l \in U$, there exists a co-LMNS $M$ of $\Phi$ such that $U \cap M = \{l\}$. Hence, by
Proposition 1, $U$ is an irreducible hitting set of $\mathrm{coLMNS}(\Phi)$. □

The restrictions on the formula $\Phi$ in Theorem 2 can, in some cases, be satisfied
a priori. Consider, for example, the case $\Phi \in \mathrm{UNSAT}$, and the labelling function as
in Example 01). Since $\mathcal{F} \in \mathrm{UNSAT}$, we have $\mathcal{F} \neq \emptyset$; as every clause is labelled
($\mathcal{F}^{\emptyset} = \emptyset$), the theorem applies unconditionally to such formulas. Thus, we get exactly
the original version of the hitting set duality theorem for unsatisfiable CNF formulas (see
Section 2). For the case of group-MUS (Example |2n)), the theorem holds whenever
$\mathcal{F}^{\emptyset} \in \mathrm{SAT}$, as this condition ensures that the formula has at least one irredundant label
(since $\mathcal{F} \in \mathrm{UNSAT}$).

The following corollary is a straightforward consequence of Theorem 2 and is a
generalized version of the relationship between MUSes and co-MSSes shown in [19].

Corollary 1. Let $\Phi$ be as in Theorem 2. Then, $\bigcup \mathrm{LMES}(\Phi) = \Lambda(\Phi) \setminus \bigcap \mathrm{LMNS}(\Phi)$.
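
The report gives no proof of the corollary. One possible derivation, offered here only as a sketch under the assumptions of Theorem 2, goes as follows. If a label $l$ belongs to some LMES $U$, then $U$ is an irreducible hitting set of the co-LMNSes, so by Proposition 1 there is a co-LMNS $M$ with $U \cap M = \{l\}$, and hence $l$ belongs to some co-LMNS. The symmetric argument, with the roles of LMESes and co-LMNSes exchanged, gives the converse. The remaining steps are set algebra:

```latex
\begin{align*}
\bigcup \mathrm{LMES}(\Phi)
  &= \bigcup \mathrm{coLMNS}(\Phi)
     && \text{(Theorem 2 and Proposition 1)} \\
  &= \bigcup_{S \in \mathrm{LMNS}(\Phi)} \bigl( \Lambda(\Phi) \setminus S \bigr)
     && \text{(co-LMNSes are complements of LMNSes)} \\
  &= \Lambda(\Phi) \setminus \bigcap \mathrm{LMNS}(\Phi)
     && \text{(De Morgan, since } \mathrm{LMNS}(\Phi) \neq \emptyset \text{)}
\end{align*}
```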



The following example illustrates the claims of Theorem 2 and Corollary 1.

Example 5. Consider the LCNF formula $\Phi$ from ExampleQ] From Examples[3]and[3]we
have the following: $\mathrm{LMES}(\Phi) = \{\{1, 2\}, \{2, 3, 4\}\}$, $\mathrm{LMNS}(\Phi) = \{\{1, 3, 4\}, \{2, 3\}, \{2, 4\}\}$,
$\mathrm{coLMNS}(\Phi) = \{\{2\}, \{1, 3\}, \{1, 4\}\}$. Note that $\mathrm{LMES}(\Phi)$ has exactly 3 irreducible
hitting sets that constitute the set $\mathrm{coLMNS}(\Phi)$. Also, $\bigcup \mathrm{LMES}(\Phi) = \{1, 2, 3, 4\} = \Lambda(\Phi)$,
and $\bigcap \mathrm{LMNS}(\Phi) = \emptyset$.
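
The claims of Example 5 can be confirmed by brute force. The Python sketch below takes the sets LMES, LMNS and the label set {1, 2, 3, 4} exactly as stated in the example. The variable and function names are our own, and the enumeration over all subsets is an assumption made for simplicity, not a method proposed in the report.

```python
from itertools import combinations

# Data copied from Example 5.
labels = frozenset({1, 2, 3, 4})
lmes = {frozenset({1, 2}), frozenset({2, 3, 4})}
lmns = {frozenset({1, 3, 4}), frozenset({2, 3}), frozenset({2, 4})}
co_lmns = {labels - s for s in lmns}


def irreducible_hitting_sets(collection, universe):
    """Brute force: hitting sets none of whose proper subsets is a hitting set."""
    subsets = [
        frozenset(c)
        for k in range(len(universe) + 1)
        for c in combinations(sorted(universe), k)
    ]
    hitting = [h for h in subsets if all(h & s for s in collection)]
    return {h for h in hitting if not any(g < h for g in hitting)}


# The co-LMNSes are the complements of the LMNSes.
assert co_lmns == {frozenset({2}), frozenset({1, 3}), frozenset({1, 4})}
# Theorem 2, parts (i) and (ii), on this instance.
assert irreducible_hitting_sets(lmes, labels) == co_lmns
assert irreducible_hitting_sets(co_lmns, labels) == lmes
# Corollary 1 on this instance.
assert frozenset().union(*lmes) == labels - frozenset.intersection(*lmns)
```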

4 Conclusion 

This report presents a framework of labelled CNF formulas that makes it possible to
generalize and extend the existing work on redundancy detection and removal in CNF formulas.
Future work includes the development of a number of additional theoretical results, 
and a suite of efficient algorithms that address various computational problems in the 
context of the proposed framework. 